Dataset statistics
| Number of variables | 28 |
|---|---|
| Number of observations | 119390 |
| Missing cells | 129425 |
| Missing cells (%) | 3.9% |
| Duplicate rows | 8237 |
| Duplicate rows (%) | 6.9% |
| Total size in memory | 24.7 MiB |
| Average record size in memory | 217.0 B |
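The percentage and per-record figures in this table can be cross-checked from the raw counts. The sketch below assumes that "missing cells (%)" is taken over all cells (rows × columns) and that 1 MiB is 1024 × 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 28
n_observations = 119390
missing_cells = 129425
duplicate_rows = 8237
total_size_mib = 24.7  # rounded in the report, so derived values are approximate

# Assumption: the missing-cell percentage is relative to rows x columns.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells (%):   {missing_pct:.1f}%")
print(f"Duplicate rows (%):  {duplicate_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

The first two values round to the 3.9% and 6.9% shown in the table. The record size agrees with the reported 217.0 B only up to the rounding of the 24.7 MiB total.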
Variable types
| Categorical | 12 |
|---|---|
| Numeric | 14 |
| Boolean | 1 |
| Text | 1 |
Alerts
| Alert | Type |
|---|---|
| Dataset has 8237 (6.9%) duplicate rows | Duplicates |
| id_travel_agency_booking is highly overall correlated with type | High correlation |
| type is highly overall correlated with id_travel_agency_booking | High correlation |
| num_children is highly imbalanced (80.7%) | Imbalance |
| num_babies is highly imbalanced (97.2%) | Imbalance |
| distribution_channel is highly imbalanced (63.2%) | Imbalance |
| repeated_guest is highly imbalanced (79.6%) | Imbalance |
| reserved_room is highly imbalanced (58.3%) | Imbalance |
| deposit_policy is highly imbalanced (65.3%) | Imbalance |
| customer_type is highly imbalanced (50.6%) | Imbalance |
| required_car_parking_spaces is highly imbalanced (85.4%) | Imbalance |
| id_travel_agency_booking has 16340 (13.7%) missing values | Missing |
| id_person_booking has 112593 (94.3%) missing values | Missing |
| num_previous_cancellations is highly skewed (γ1 = 24.45804872) | Skewed |
| num_previous_stays is highly skewed (γ1 = 23.53979995) | Skewed |
| days_between_booking_arrival has 6345 (5.3%) zeros | Zeros |
| num_weekend_nights has 51998 (43.6%) zeros | Zeros |
| num_workweek_nights has 7645 (6.4%) zeros | Zeros |
| market_segment has 12606 (10.6%) zeros | Zeros |
| num_previous_cancellations has 112906 (94.6%) zeros | Zeros |
| num_previous_stays has 115770 (97.0%) zeros | Zeros |
| changes_between_booking_arrival has 101314 (84.9%) zeros | Zeros |
| avg_price has 1960 (1.6%) zeros | Zeros |
| total_of_special_requests has 70318 (58.9%) zeros | Zeros |
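The report does not say how the imbalance percentages are computed. One definition that reproduces the reported values is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy log2(k) for k classes. The sketch below is written under that assumption; `imbalance_score` is an illustrative name, and the counts are taken from the Common Values tables of `repeated_guest` and `num_babies`.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean that one
    class dominates. A single-class column is scored 0.0 here, which is
    an assumption of this sketch.
    """
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts copied from the report.
repeated_guest = [115580, 3810]
num_babies = [118473, 900, 15, 1, 1]

print(f"repeated_guest: {100 * imbalance_score(repeated_guest):.1f}%")
print(f"num_babies:     {100 * imbalance_score(num_babies):.1f}%")
```

Under this definition a 50/50 split scores 0%, so the score reflects how far a column is from uniform rather than the share of its largest class.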
Reproduction
| Analysis started | 2024-02-06 10:33:37.145233 |
|---|---|
| Analysis finished | 2024-02-06 10:35:12.705780 |
| Duration | 1 minute and 35.56 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
cancellation
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 119390 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 75166 | 63.0% |
| 1 | 44224 | 37.0% |
Words
| Value | Count | Frequency (%) |
| 0 | 75166 | 63.0% |
| 1 | 44224 | 37.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 75166 | |
| 1 | 44224 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 119390 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 75166 | |
| 1 | 44224 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 119390 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 75166 | |
| 1 | 44224 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 119390 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 75166 | |
| 1 | 44224 |
type
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 5 |
| Mean length | 7.0132339 |
| Min length | 5 |
Characters and Unicode
| Total characters | 837310 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Fancy Hotel |
|---|---|
| 2nd row | Fancy Hotel |
| 3rd row | Fancy Hotel |
| 4th row | Fancy Hotel |
| 5th row | Fancy Hotel |
Common Values
| Value | Count | Frequency (%) |
| Hotel | 79330 | 66.4% |
| Fancy Hotel | 40060 | 33.6% |
Words
| Value | Count | Frequency (%) |
| hotel | 119390 | 74.9% |
| fancy | 40060 | 25.1% |
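The length and character figures reported for `type` follow directly from its two category counts. A minimal sketch, assuming only the counts in the Common Values table (the variable names are illustrative):

```python
# Category counts copied from the Common Values table for `type`.
value_counts = {"Hotel": 79330, "Fancy Hotel": 40060}

n_rows = sum(value_counts.values())

# Each row contributes len(value) characters.
total_characters = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_characters / n_rows

# Distinct characters across all category labels (the space counts too).
distinct_characters = len(set("".join(value_counts)))

print(n_rows, total_characters, round(mean_length, 7), distinct_characters)
```

The results agree with the report: 837310 total characters, a mean length of about 7.0132339, and 11 distinct characters.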
Most occurring characters
| Value | Count | Frequency (%) |
| H | 119390 | |
| o | 119390 | |
| t | 119390 | |
| e | 119390 | |
| l | 119390 | |
| F | 40060 | 4.8% |
| a | 40060 | 4.8% |
| n | 40060 | 4.8% |
| c | 40060 | 4.8% |
| y | 40060 | 4.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 637800 | |
| Uppercase Letter | 159450 | 19.0% |
| Space Separator | 40060 | 4.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 119390 | |
| t | 119390 | |
| e | 119390 | |
| l | 119390 | |
| a | 40060 | 6.3% |
| n | 40060 | 6.3% |
| c | 40060 | 6.3% |
| y | 40060 | 6.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 119390 | |
| F | 40060 | 25.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 40060 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 797250 | |
| Common | 40060 | 4.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| H | 119390 | |
| o | 119390 | |
| t | 119390 | |
| e | 119390 | |
| l | 119390 | |
| F | 40060 | 5.0% |
| a | 40060 | 5.0% |
| n | 40060 | 5.0% |
| c | 40060 | 5.0% |
| y | 40060 | 5.0% |
Common
| Value | Count | Frequency (%) |
| (space) | 40060 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 837310 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| H | 119390 | |
| o | 119390 | |
| t | 119390 | |
| e | 119390 | |
| l | 119390 | |
| F | 40060 | 4.8% |
| a | 40060 | 4.8% |
| n | 40060 | 4.8% |
| c | 40060 | 4.8% |
| y | 40060 | 4.8% |
days_between_booking_arrival
Real number (ℝ)
ZEROS 
| Distinct | 479 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 104.01142 |
| Minimum | 0 |
| Maximum | 737 |
| Zeros | 6345 |
| Zeros (%) | 5.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 18 |
| median | 69 |
| Q3 | 160 |
| 95-th percentile | 320 |
| Maximum | 737 |
| Range | 737 |
| Interquartile range (IQR) | 142 |
Descriptive statistics
| Standard deviation | 106.8631 |
|---|---|
| Coefficient of variation (CV) | 1.027417 |
| Kurtosis | 1.6964488 |
| Mean | 104.01142 |
| Median Absolute Deviation (MAD) | 60 |
| Skewness | 1.3465499 |
| Sum | 12417923 |
| Variance | 11419.722 |
| Monotonicity | Not monotonic |
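Several of the statistics above are derived from others in the same tables, which makes them easy to sanity-check. The sketch below recomputes the mean, range, IQR, coefficient of variation and variance for `days_between_booking_arrival` from the reported sum, row count, quartiles and standard deviation. The variable names are illustrative, and the inputs are rounded report values, so agreement is only up to rounding.

```python
# Values copied from the report for days_between_booking_arrival.
n_observations = 119390
total = 12417923               # "Sum"
minimum, maximum = 0, 737
q1, q3 = 18, 160
std = 106.8631                 # "Standard deviation" (rounded)

mean = total / n_observations  # Mean = Sum / number of observations
value_range = maximum - minimum
iqr = q3 - q1                  # Interquartile range = Q3 - Q1
cv = std / mean                # Coefficient of variation = std / mean
variance = std ** 2            # Variance = std squared

print(f"mean={mean:.5f} range={value_range} iqr={iqr} cv={cv:.6f} variance={variance:.3f}")
```

A coefficient of variation just above 1 means the standard deviation is slightly larger than the mean, consistent with the long right tail suggested by the skewness of 1.35.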
Common Values
| Value | Count | Frequency (%) |
| 0 | 6345 | 5.3% |
| 1 | 3460 | 2.9% |
| 2 | 2069 | 1.7% |
| 3 | 1816 | 1.5% |
| 4 | 1715 | 1.4% |
| 5 | 1565 | 1.3% |
| 6 | 1445 | 1.2% |
| 7 | 1331 | 1.1% |
| 8 | 1138 | 1.0% |
| 12 | 1079 | 0.9% |
| Other values (469) | 97427 | 81.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 6345 | 5.3% |
| 1 | 3460 | 2.9% |
| 2 | 2069 | 1.7% |
| 3 | 1816 | 1.5% |
| 4 | 1715 | 1.4% |
| 5 | 1565 | 1.3% |
| 6 | 1445 | 1.2% |
| 7 | 1331 | 1.1% |
| 8 | 1138 | 1.0% |
| 9 | 992 | 0.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 737 | 1 | < 0.1% |
| 709 | 1 | < 0.1% |
| 629 | 17 | < 0.1% |
| 626 | 30 | < 0.1% |
| 622 | 17 | < 0.1% |
| 615 | 17 | < 0.1% |
| 608 | 17 | < 0.1% |
| 605 | 30 | < 0.1% |
| 601 | 17 | < 0.1% |
| 594 | 17 | < 0.1% |
year_arrival_date
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 477560 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2015 |
|---|---|
| 2nd row | 2015 |
| 3rd row | 2015 |
| 4th row | 2015 |
| 5th row | 2015 |
Common Values
| Value | Count | Frequency (%) |
| 2016 | 56707 | 47.5% |
| 2017 | 40687 | 34.1% |
| 2015 | 21996 | 18.4% |
Words
| Value | Count | Frequency (%) |
| 2016 | 56707 | 47.5% |
| 2017 | 40687 | 34.1% |
| 2015 | 21996 | 18.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 119390 | |
| 0 | 119390 | |
| 1 | 119390 | |
| 6 | 56707 | |
| 7 | 40687 | 8.5% |
| 5 | 21996 | 4.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 477560 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 119390 | |
| 0 | 119390 | |
| 1 | 119390 | |
| 6 | 56707 | |
| 7 | 40687 | 8.5% |
| 5 | 21996 | 4.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 477560 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 119390 | |
| 0 | 119390 | |
| 1 | 119390 | |
| 6 | 56707 | |
| 7 | 40687 | 8.5% |
| 5 | 21996 | 4.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 477560 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 119390 | |
| 0 | 119390 | |
| 1 | 119390 | |
| 6 | 56707 | |
| 7 | 40687 | 8.5% |
| 5 | 21996 | 4.6% |
month_arrival_date
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 5.9031828 |
| Min length | 3 |
Characters and Unicode
| Total characters | 704781 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | July |
|---|---|
| 2nd row | July |
| 3rd row | July |
| 4th row | July |
| 5th row | July |
Common Values
| Value | Count | Frequency (%) |
| August | 13877 | 11.6% |
| July | 12661 | 10.6% |
| May | 11791 | 9.9% |
| October | 11160 | 9.3% |
| April | 11089 | 9.3% |
| June | 10939 | 9.2% |
| September | 10508 | 8.8% |
| March | 9794 | 8.2% |
| February | 8068 | 6.8% |
| November | 6794 | 5.7% |
| Other values (2) | 12709 | 10.6% |
Words
| Value | Count | Frequency (%) |
| august | 13877 | 11.6% |
| july | 12661 | 10.6% |
| may | 11791 | 9.9% |
| october | 11160 | 9.3% |
| april | 11089 | 9.3% |
| june | 10939 | 9.2% |
| september | 10508 | 8.8% |
| march | 9794 | 8.2% |
| february | 8068 | 6.8% |
| november | 6794 | 5.7% |
| Other values (2) | 12709 | 10.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 95619 | |
| r | 78190 | 11.1% |
| u | 65351 | 9.3% |
| b | 43310 | 6.1% |
| a | 41511 | 5.9% |
| y | 38449 | 5.5% |
| t | 35545 | 5.0% |
| J | 29529 | 4.2% |
| c | 27734 | 3.9% |
| A | 24966 | 3.5% |
| Other values (16) | 224577 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 585391 | |
| Uppercase Letter | 119390 | 16.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 95619 | |
| r | 78190 | |
| u | 65351 | |
| b | 43310 | 7.4% |
| a | 41511 | 7.1% |
| y | 38449 | 6.6% |
| t | 35545 | 6.1% |
| c | 27734 | 4.7% |
| m | 24082 | 4.1% |
| l | 23750 | 4.1% |
| Other values (8) | 111850 |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 29529 | |
| A | 24966 | |
| M | 21585 | |
| O | 11160 | 9.3% |
| S | 10508 | 8.8% |
| F | 8068 | 6.8% |
| N | 6794 | 5.7% |
| D | 6780 | 5.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 704781 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 95619 | |
| r | 78190 | 11.1% |
| u | 65351 | 9.3% |
| b | 43310 | 6.1% |
| a | 41511 | 5.9% |
| y | 38449 | 5.5% |
| t | 35545 | 5.0% |
| J | 29529 | 4.2% |
| c | 27734 | 3.9% |
| A | 24966 | 3.5% |
| Other values (16) | 224577 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 704781 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 95619 | |
| r | 78190 | 11.1% |
| u | 65351 | 9.3% |
| b | 43310 | 6.1% |
| a | 41511 | 5.9% |
| y | 38449 | 5.5% |
| t | 35545 | 5.0% |
| J | 29529 | 4.2% |
| c | 27734 | 3.9% |
| A | 24966 | 3.5% |
| Other values (16) | 224577 |
week_number_arrival_date
Real number (ℝ)
| Distinct | 53 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.165173 |
| Minimum | 1 |
| Maximum | 53 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 16 |
| median | 28 |
| Q3 | 38 |
| 95-th percentile | 49 |
| Maximum | 53 |
| Range | 52 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 13.605138 |
|---|---|
| Coefficient of variation (CV) | 0.50083018 |
| Kurtosis | -0.98607718 |
| Mean | 27.165173 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | -0.010014326 |
| Sum | 3243250 |
| Variance | 185.09979 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 33 | 3580 | 3.0% |
| 30 | 3087 | 2.6% |
| 32 | 3045 | 2.6% |
| 34 | 3040 | 2.5% |
| 18 | 2926 | 2.5% |
| 21 | 2854 | 2.4% |
| 28 | 2853 | 2.4% |
| 17 | 2805 | 2.3% |
| 20 | 2785 | 2.3% |
| 29 | 2763 | 2.3% |
| Other values (43) | 89652 | 75.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 1047 | 0.9% |
| 2 | 1218 | 1.0% |
| 3 | 1319 | 1.1% |
| 4 | 1487 | 1.2% |
| 5 | 1387 | 1.2% |
| 6 | 1508 | 1.3% |
| 7 | 2109 | 1.8% |
| 8 | 2216 | 1.9% |
| 9 | 2117 | 1.8% |
| 10 | 2149 | 1.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 53 | 1816 | 1.5% |
| 52 | 1195 | 1.0% |
| 51 | 933 | 0.8% |
| 50 | 1505 | 1.3% |
| 49 | 1782 | 1.5% |
| 48 | 1504 | 1.3% |
| 47 | 1685 | 1.4% |
| 46 | 1574 | 1.3% |
| 45 | 1941 | 1.6% |
| 44 | 2272 | 1.9% |
day_of_month_arrival_date
Real number (ℝ)
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.798241 |
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 16 |
| Q3 | 23 |
| 95-th percentile | 30 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.7808295 |
|---|---|
| Coefficient of variation (CV) | 0.55581058 |
| Kurtosis | -1.1871683 |
| Mean | 15.798241 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | -0.002000454 |
| Sum | 1886152 |
| Variance | 77.102966 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 17 | 4406 | 3.7% |
| 5 | 4317 | 3.6% |
| 15 | 4196 | 3.5% |
| 25 | 4160 | 3.5% |
| 26 | 4147 | 3.5% |
| 9 | 4096 | 3.4% |
| 12 | 4087 | 3.4% |
| 16 | 4078 | 3.4% |
| 2 | 4055 | 3.4% |
| 19 | 4052 | 3.4% |
| Other values (21) | 77796 | 65.2% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 3626 | 3.0% |
| 2 | 4055 | 3.4% |
| 3 | 3855 | 3.2% |
| 4 | 3763 | 3.2% |
| 5 | 4317 | 3.6% |
| 6 | 3833 | 3.2% |
| 7 | 3665 | 3.1% |
| 8 | 3921 | 3.3% |
| 9 | 4096 | 3.4% |
| 10 | 3575 | 3.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 31 | 2208 | 1.8% |
| 30 | 3853 | 3.2% |
| 29 | 3580 | 3.0% |
| 28 | 3946 | 3.3% |
| 27 | 3802 | 3.2% |
| 26 | 4147 | 3.5% |
| 25 | 4160 | 3.5% |
| 24 | 3993 | 3.3% |
| 23 | 3616 | 3.0% |
| 22 | 3596 | 3.0% |
num_weekend_nights
Real number (ℝ)
ZEROS 
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.92759863 |
| Minimum | 0 |
| Maximum | 19 |
| Zeros | 51998 |
| Zeros (%) | 43.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 2 |
| Maximum | 19 |
| Range | 19 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 0.99861349 |
|---|---|
| Coefficient of variation (CV) | 1.0765578 |
| Kurtosis | 7.1740661 |
| Mean | 0.92759863 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.3800464 |
| Sum | 110746 |
| Variance | 0.99722891 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 51998 | 43.6% |
| 2 | 33308 | 27.9% |
| 1 | 30626 | 25.7% |
| 4 | 1855 | 1.6% |
| 3 | 1259 | 1.1% |
| 6 | 153 | 0.1% |
| 5 | 79 | 0.1% |
| 8 | 60 | 0.1% |
| 7 | 19 | < 0.1% |
| 9 | 11 | < 0.1% |
| Other values (7) | 22 | < 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 51998 | 43.6% |
| 1 | 30626 | 25.7% |
| 2 | 33308 | 27.9% |
| 3 | 1259 | 1.1% |
| 4 | 1855 | 1.6% |
| 5 | 79 | 0.1% |
| 6 | 153 | 0.1% |
| 7 | 19 | < 0.1% |
| 8 | 60 | 0.1% |
| 9 | 11 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 19 | 1 | < 0.1% |
| 18 | 1 | < 0.1% |
| 16 | 3 | < 0.1% |
| 14 | 2 | < 0.1% |
| 13 | 3 | < 0.1% |
| 12 | 5 | < 0.1% |
| 10 | 7 | < 0.1% |
| 9 | 11 | < 0.1% |
| 8 | 60 | 0.1% |
| 7 | 19 | < 0.1% |
num_workweek_nights
Real number (ℝ)
ZEROS 
| Distinct | 35 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.5003015 |
| Minimum | 0 |
| Maximum | 50 |
| Zeros | 7645 |
| Zeros (%) | 6.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 50 |
| Range | 50 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.9082856 |
|---|---|
| Coefficient of variation (CV) | 0.76322219 |
| Kurtosis | 24.284555 |
| Mean | 2.5003015 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.8622492 |
| Sum | 298511 |
| Variance | 3.641554 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 2 | 33684 | 28.2% |
| 1 | 30310 | 25.4% |
| 3 | 22258 | 18.6% |
| 5 | 11077 | 9.3% |
| 4 | 9563 | 8.0% |
| 0 | 7645 | 6.4% |
| 6 | 1499 | 1.3% |
| 10 | 1036 | 0.9% |
| 7 | 1029 | 0.9% |
| 8 | 656 | 0.5% |
| Other values (25) | 633 | 0.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 7645 | 6.4% |
| 1 | 30310 | 25.4% |
| 2 | 33684 | 28.2% |
| 3 | 22258 | 18.6% |
| 4 | 9563 | 8.0% |
| 5 | 11077 | 9.3% |
| 6 | 1499 | 1.3% |
| 7 | 1029 | 0.9% |
| 8 | 656 | 0.5% |
| 9 | 231 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 50 | 1 | < 0.1% |
| 42 | 1 | < 0.1% |
| 41 | 1 | < 0.1% |
| 40 | 2 | < 0.1% |
| 35 | 1 | < 0.1% |
| 34 | 1 | < 0.1% |
| 33 | 1 | < 0.1% |
| 32 | 1 | < 0.1% |
| 30 | 5 | < 0.1% |
| 26 | 1 | < 0.1% |
num_adults
Real number (ℝ)
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.8564034 |
| Minimum | 0 |
| Maximum | 55 |
| Zeros | 403 |
| Zeros (%) | 0.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 3 |
| Maximum | 55 |
| Range | 55 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.579261 |
|---|---|
| Coefficient of variation (CV) | 0.31203401 |
| Kurtosis | 1352.1151 |
| Mean | 1.8564034 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 18.317805 |
| Sum | 221636 |
| Variance | 0.3355433 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 2 | 89680 | 75.1% |
| 1 | 23027 | 19.3% |
| 3 | 6202 | 5.2% |
| 0 | 403 | 0.3% |
| 4 | 62 | 0.1% |
| 26 | 5 | < 0.1% |
| 27 | 2 | < 0.1% |
| 20 | 2 | < 0.1% |
| 5 | 2 | < 0.1% |
| 40 | 1 | < 0.1% |
| Other values (4) | 4 | < 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 403 | 0.3% |
| 1 | 23027 | 19.3% |
| 2 | 89680 | 75.1% |
| 3 | 6202 | 5.2% |
| 4 | 62 | 0.1% |
| 5 | 2 | < 0.1% |
| 6 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 20 | 2 | < 0.1% |
| 26 | 5 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 55 | 1 | < 0.1% |
| 50 | 1 | < 0.1% |
| 40 | 1 | < 0.1% |
| 27 | 2 | < 0.1% |
| 26 | 5 | < 0.1% |
| 20 | 2 | < 0.1% |
| 10 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 5 | 2 | < 0.1% |
| 4 | 62 | 0.1% |
num_children
Categorical
IMBALANCE 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 4 |
| Missing (%) | < 0.1% |
| Memory size | 932.9 KiB |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.0000084 |
| Min length | 3 |
Characters and Unicode
| Total characters | 358159 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 110796 | 92.8% |
| 1.0 | 4861 | 4.1% |
| 2.0 | 3652 | 3.1% |
| 3.0 | 76 | 0.1% |
| 10.0 | 1 | < 0.1% |
| (Missing) | 4 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| 0.0 | 110796 | 92.8% |
| 1.0 | 4861 | 4.1% |
| 2.0 | 3652 | 3.1% |
| 3.0 | 76 | 0.1% |
| 10.0 | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 230183 | |
| . | 119386 | |
| 1 | 4862 | 1.4% |
| 2 | 3652 | 1.0% |
| 3 | 76 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 238773 | |
| Other Punctuation | 119386 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 230183 | |
| 1 | 4862 | 2.0% |
| 2 | 3652 | 1.5% |
| 3 | 76 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 119386 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 358159 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 230183 | |
| . | 119386 | |
| 1 | 4862 | 1.4% |
| 2 | 3652 | 1.0% |
| 3 | 76 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 358159 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 230183 | |
| . | 119386 | |
| 1 | 4862 | 1.4% |
| 2 | 3652 | 1.0% |
| 3 | 76 | < 0.1% |
num_babies
Categorical
IMBALANCE 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.0000084 |
| Min length | 1 |
Characters and Unicode
| Total characters | 119391 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 118473 | 99.2% |
| 1 | 900 | 0.8% |
| 2 | 15 | < 0.1% |
| 10 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| 0 | 118473 | 99.2% |
| 1 | 900 | 0.8% |
| 2 | 15 | < 0.1% |
| 10 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 118474 | |
| 1 | 901 | 0.8% |
| 2 | 15 | < 0.1% |
| 9 | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 119391 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 118474 | |
| 1 | 901 | 0.8% |
| 2 | 15 | < 0.1% |
| 9 | 1 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 119391 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 118474 | |
| 1 | 901 | 0.8% |
| 2 | 15 | < 0.1% |
| 9 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 119391 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 118474 | |
| 1 | 901 | 0.8% |
| 2 | 15 | < 0.1% |
| 9 | 1 | < 0.1% |
breakfast
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 116.7 KiB |
Common Values
| Value | Count | Frequency (%) |
| True | 92310 | 77.3% |
| False | 27080 | 22.7% |
country
Text
| Distinct | 177 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 488 |
| Missing (%) | 0.4% |
| Memory size | 932.9 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 2.9892432 |
| Min length | 2 |
Characters and Unicode
| Total characters | 355427 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 30 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | PRT |
|---|---|
| 2nd row | PRT |
| 3rd row | GBR |
| 4th row | GBR |
| 5th row | GBR |
Words
| Value | Count | Frequency (%) |
| prt | 48590 | 40.7% |
| gbr | 12129 | 10.2% |
| fra | 10415 | 8.8% |
| esp | 8568 | 7.2% |
| deu | 7287 | 6.1% |
| ita | 3766 | 3.2% |
| irl | 3375 | 2.8% |
| bel | 2342 | 2.0% |
| bra | 2224 | 1.9% |
| nld | 2104 | 1.8% |
| Other values (167) | 18102 | 15.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| R | 80804 | |
| P | 58506 | |
| T | 54263 | |
| A | 21627 | 6.1% |
| E | 21538 | 6.1% |
| B | 17051 | 4.8% |
| S | 13931 | 3.9% |
| U | 13293 | 3.7% |
| G | 13130 | 3.7% |
| F | 10956 | 3.1% |
| Other values (16) | 50328 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 355427 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 80804 | |
| P | 58506 | |
| T | 54263 | |
| A | 21627 | 6.1% |
| E | 21538 | 6.1% |
| B | 17051 | 4.8% |
| S | 13931 | 3.9% |
| U | 13293 | 3.7% |
| G | 13130 | 3.7% |
| F | 10956 | 3.1% |
| Other values (16) | 50328 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 355427 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| R | 80804 | |
| P | 58506 | |
| T | 54263 | |
| A | 21627 | 6.1% |
| E | 21538 | 6.1% |
| B | 17051 | 4.8% |
| S | 13931 | 3.9% |
| U | 13293 | 3.7% |
| G | 13130 | 3.7% |
| F | 10956 | 3.1% |
| Other values (16) | 50328 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 355427 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| R | 80804 | |
| P | 58506 | |
| T | 54263 | |
| A | 21627 | 6.1% |
| E | 21538 | 6.1% |
| B | 17051 | 4.8% |
| S | 13931 | 3.9% |
| U | 13293 | 3.7% |
| G | 13130 | 3.7% |
| F | 10956 | 3.1% |
| Other values (16) | 50328 |
market_segment
Real number (ℝ)
ZEROS 
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.4675768 |
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 12606 |
| Zeros (%) | 10.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.4209671 |
|---|---|
| Coefficient of variation (CV) | 0.57585524 |
| Kurtosis | -0.091205613 |
| Mean | 2.4675768 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.40380156 |
| Sum | 294604 |
| Variance | 2.0191474 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 2 | 56477 | 47.3% |
| 3 | 24219 | 20.3% |
| 5 | 19811 | 16.6% |
| 0 | 12606 | 10.6% |
| 1 | 5295 | 4.4% |
| 4 | 743 | 0.6% |
| 7 | 237 | 0.2% |
| 6 | 2 | < 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 12606 | 10.6% |
| 1 | 5295 | 4.4% |
| 2 | 56477 | 47.3% |
| 3 | 24219 | 20.3% |
| 4 | 743 | 0.6% |
| 5 | 19811 | 16.6% |
| 6 | 2 | < 0.1% |
| 7 | 237 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 7 | 237 | 0.2% |
| 6 | 2 | < 0.1% |
| 5 | 19811 | 16.6% |
| 4 | 743 | 0.6% |
| 3 | 24219 | 20.3% |
| 2 | 56477 | 47.3% |
| 1 | 5295 | 4.4% |
| 0 | 12606 | 10.6% |
distribution_channel
Categorical
IMBALANCE 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 119390 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 2 | 97870 | 82.0% |
| 0 | 14645 | 12.3% |
| 1 | 6677 | 5.6% |
| 4 | 193 | 0.2% |
| 3 | 5 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| 2 | 97870 | 82.0% |
| 0 | 14645 | 12.3% |
| 1 | 6677 | 5.6% |
| 4 | 193 | 0.2% |
| 3 | 5 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 97870 | |
| 0 | 14645 | 12.3% |
| 1 | 6677 | 5.6% |
| 4 | 193 | 0.2% |
| 3 | 5 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 119390 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 97870 | |
| 0 | 14645 | 12.3% |
| 1 | 6677 | 5.6% |
| 4 | 193 | 0.2% |
| 3 | 5 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 119390 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 97870 | |
| 0 | 14645 | 12.3% |
| 1 | 6677 | 5.6% |
| 4 | 193 | 0.2% |
| 3 | 5 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 119390 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 97870 | |
| 0 | 14645 | 12.3% |
| 1 | 6677 | 5.6% |
| 4 | 193 | 0.2% |
| 3 | 5 | < 0.1% |
repeated_guest
Categorical
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 119390 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 115580 | 96.8% |
| 1 | 3810 | 3.2% |
Words
| Value | Count | Frequency (%) |
| 0 | 115580 | 96.8% |
| 1 | 3810 | 3.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 115580 | |
| 1 | 3810 | 3.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 119390 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 115580 | |
| 1 | 3810 | 3.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 119390 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 115580 | |
| 1 | 3810 | 3.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 119390 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 115580 | |
| 1 | 3810 | 3.2% |
num_previous_cancellations
Real number (ℝ)
SKEWED  ZEROS 
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.087117849 |
| Minimum | 0 |
| Maximum | 26 |
| Zeros | 112906 |
| Zeros (%) | 94.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 26 |
| Range | 26 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.84433638 |
|---|---|
| Coefficient of variation (CV) | 9.6918874 |
| Kurtosis | 674.07369 |
| Mean | 0.087117849 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 24.458049 |
| Sum | 10401 |
| Variance | 0.71290393 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 112906 | |
| 1 | 6051 | 5.1% |
| 2 | 116 | 0.1% |
| 3 | 65 | 0.1% |
| 24 | 48 | < 0.1% |
| 11 | 35 | < 0.1% |
| 4 | 31 | < 0.1% |
| 26 | 26 | < 0.1% |
| 25 | 25 | < 0.1% |
| 6 | 22 | < 0.1% |
| Other values (5) | 65 | 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 112906 | 94.6% |
| 1 | 6051 | 5.1% |
| 2 | 116 | 0.1% |
| 3 | 65 | 0.1% |
| 4 | 31 | < 0.1% |
| 5 | 19 | < 0.1% |
| 6 | 22 | < 0.1% |
| 11 | 35 | < 0.1% |
| 13 | 12 | < 0.1% |
| 14 | 14 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 26 | 26 | < 0.1% |
| 25 | 25 | < 0.1% |
| 24 | 48 | < 0.1% |
| 21 | 1 | < 0.1% |
| 19 | 19 | < 0.1% |
| 14 | 14 | < 0.1% |
| 13 | 12 | < 0.1% |
| 11 | 35 | < 0.1% |
| 6 | 22 | < 0.1% |
| 5 | 19 | < 0.1% |
num_previous_stays
Real number (ℝ)
SKEWED  ZEROS 
| Distinct | 73 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.13709691 |
| Minimum | 0 |
|---|---|
| Maximum | 72 |
| Zeros | 115770 |
| Zeros (%) | 97.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 72 |
| Range | 72 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.4974368 |
|---|---|
| Coefficient of variation (CV) | 10.92247 |
| Kurtosis | 767.24521 |
| Mean | 0.13709691 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 23.5398 |
| Sum | 16368 |
| Variance | 2.2423171 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 0 | 115770 | 97.0% |
| 1 | 1542 | 1.3% |
| 2 | 580 | 0.5% |
| 3 | 333 | 0.3% |
| 4 | 229 | 0.2% |
| 5 | 181 | 0.2% |
| 6 | 115 | 0.1% |
| 7 | 88 | 0.1% |
| 8 | 70 | 0.1% |
| 9 | 60 | 0.1% |
| Other values (63) | 422 | 0.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 115770 | 97.0% |
| 1 | 1542 | 1.3% |
| 2 | 580 | 0.5% |
| 3 | 333 | 0.3% |
| 4 | 229 | 0.2% |
| 5 | 181 | 0.2% |
| 6 | 115 | 0.1% |
| 7 | 88 | 0.1% |
| 8 | 70 | 0.1% |
| 9 | 60 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 72 | 1 | < 0.1% |
| 71 | 1 | < 0.1% |
| 70 | 1 | < 0.1% |
| 69 | 1 | < 0.1% |
| 68 | 1 | < 0.1% |
| 67 | 1 | < 0.1% |
| 66 | 1 | < 0.1% |
| 65 | 1 | < 0.1% |
| 64 | 1 | < 0.1% |
| 63 | 1 | < 0.1% |
reserved_room
Categorical
IMBALANCE 
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| A | 85994 |
|---|---|
| D | 19201 |
| E | 6535 |
| F | 2897 |
| G | 2094 |
| Other values (5) | 2669 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 119390 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | C |
|---|---|
| 2nd row | C |
| 3rd row | A |
| 4th row | A |
| 5th row | A |
Common Values
| Value | Count | Frequency (%) |
| A | 85994 | 72.0% |
| D | 19201 | 16.1% |
| E | 6535 | 5.5% |
| F | 2897 | 2.4% |
| G | 2094 | 1.8% |
| B | 1118 | 0.9% |
| C | 932 | 0.8% |
| H | 601 | 0.5% |
| P | 12 | < 0.1% |
| L | 6 | < 0.1% |
[Plots: Length; Common Values]
Words
| Value | Count | Frequency (%) |
| a | 85994 | 72.0% |
| d | 19201 | 16.1% |
| e | 6535 | 5.5% |
| f | 2897 | 2.4% |
| g | 2094 | 1.8% |
| b | 1118 | 0.9% |
| c | 932 | 0.8% |
| h | 601 | 0.5% |
| p | 12 | < 0.1% |
| l | 6 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 85994 | 72.0% |
| D | 19201 | 16.1% |
| E | 6535 | 5.5% |
| F | 2897 | 2.4% |
| G | 2094 | 1.8% |
| B | 1118 | 0.9% |
| C | 932 | 0.8% |
| H | 601 | 0.5% |
| P | 12 | < 0.1% |
| L | 6 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 119390 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 85994 | 72.0% |
| D | 19201 | 16.1% |
| E | 6535 | 5.5% |
| F | 2897 | 2.4% |
| G | 2094 | 1.8% |
| B | 1118 | 0.9% |
| C | 932 | 0.8% |
| H | 601 | 0.5% |
| P | 12 | < 0.1% |
| L | 6 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 119390 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 85994 | 72.0% |
| D | 19201 | 16.1% |
| E | 6535 | 5.5% |
| F | 2897 | 2.4% |
| G | 2094 | 1.8% |
| B | 1118 | 0.9% |
| C | 932 | 0.8% |
| H | 601 | 0.5% |
| P | 12 | < 0.1% |
| L | 6 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 119390 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 85994 | 72.0% |
| D | 19201 | 16.1% |
| E | 6535 | 5.5% |
| F | 2897 | 2.4% |
| G | 2094 | 1.8% |
| B | 1118 | 0.9% |
| C | 932 | 0.8% |
| H | 601 | 0.5% |
| P | 12 | < 0.1% |
| L | 6 | < 0.1% |
changes_between_booking_arrival
Real number (ℝ)
ZEROS 
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.22112405 |
| Minimum | 0 |
|---|---|
| Maximum | 21 |
| Zeros | 101314 |
| Zeros (%) | 84.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 21 |
| Range | 21 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.65230557 |
|---|---|
| Coefficient of variation (CV) | 2.9499531 |
| Kurtosis | 79.393605 |
| Mean | 0.22112405 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.0002701 |
| Sum | 26400 |
| Variance | 0.42550256 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 0 | 101314 | 84.9% |
| 1 | 12701 | 10.6% |
| 2 | 3805 | 3.2% |
| 3 | 927 | 0.8% |
| 4 | 376 | 0.3% |
| 5 | 118 | 0.1% |
| 6 | 63 | 0.1% |
| 7 | 31 | < 0.1% |
| 8 | 17 | < 0.1% |
| 9 | 8 | < 0.1% |
| Other values (11) | 30 | < 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 101314 | 84.9% |
| 1 | 12701 | 10.6% |
| 2 | 3805 | 3.2% |
| 3 | 927 | 0.8% |
| 4 | 376 | 0.3% |
| 5 | 118 | 0.1% |
| 6 | 63 | 0.1% |
| 7 | 31 | < 0.1% |
| 8 | 17 | < 0.1% |
| 9 | 8 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 21 | 1 | < 0.1% |
| 20 | 1 | < 0.1% |
| 18 | 1 | < 0.1% |
| 17 | 2 | < 0.1% |
| 16 | 2 | < 0.1% |
| 15 | 3 | < 0.1% |
| 14 | 5 | < 0.1% |
| 13 | 5 | < 0.1% |
| 12 | 2 | < 0.1% |
| 11 | 2 | < 0.1% |
deposit_policy
Categorical
IMBALANCE 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| No Deposit | 104641 |
|---|---|
| Non Refund | 14587 |
| Refundable | 162 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 1193900 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | No Deposit |
|---|---|
| 2nd row | No Deposit |
| 3rd row | No Deposit |
| 4th row | No Deposit |
| 5th row | No Deposit |
Common Values
| Value | Count | Frequency (%) |
| No Deposit | 104641 | 87.6% |
| Non Refund | 14587 | 12.2% |
| Refundable | 162 | 0.1% |
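The overview flags deposit_policy as highly imbalanced (65.3%). One way to arrive at a figure like that, offered here as an assumption rather than as the profiler's documented formula, is one minus the Shannon entropy of the class frequencies divided by its maximum, log(k) for k classes. The sketch applies that definition to the three counts in the Common Values table; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 = perfectly balanced, 1 = single class."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 1.0  # convention chosen for this sketch: one class has no spread
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(probs))

# Counts from the Common Values table: No Deposit, Non Refund, Refundable.
deposit_counts = [104641, 14587, 162]
score = imbalance_score(deposit_counts)
print(f"imbalance = {score:.1%}")
```

Under this assumed definition the score comes out at roughly 0.653, in line with the 65.3% shown in the alert.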
[Plots: Length; Common Values]
Words
| Value | Count | Frequency (%) |
| no | 104641 | 43.9% |
| deposit | 104641 | 43.9% |
| non | 14587 | 6.1% |
| refund | 14587 | 6.1% |
| refundable | 162 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 223869 | 18.8% |
| e | 119552 | 10.0% |
| N | 119228 | 10.0% |
| (space) | 119228 | 10.0% |
| s | 104641 | 8.8% |
| i | 104641 | 8.8% |
| t | 104641 | 8.8% |
| p | 104641 | 8.8% |
| D | 104641 | 8.8% |
| n | 29336 | 2.5% |
| Other values (7) | 59482 | 5.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 836054 | 70.0% |
| Uppercase Letter | 238618 | 20.0% |
| Space Separator | 119228 | 10.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 223869 | 26.8% |
| e | 119552 | 14.3% |
| s | 104641 | 12.5% |
| i | 104641 | 12.5% |
| t | 104641 | 12.5% |
| p | 104641 | 12.5% |
| n | 29336 | 3.5% |
| f | 14749 | 1.8% |
| u | 14749 | 1.8% |
| d | 14749 | 1.8% |
| Other values (3) | 486 | 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 119228 | 50.0% |
| D | 104641 | 43.9% |
| R | 14749 | 6.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 119228 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1074672 | 90.0% |
| Common | 119228 | 10.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 223869 | 20.8% |
| e | 119552 | 11.1% |
| N | 119228 | 11.1% |
| s | 104641 | 9.7% |
| i | 104641 | 9.7% |
| t | 104641 | 9.7% |
| p | 104641 | 9.7% |
| D | 104641 | 9.7% |
| n | 29336 | 2.7% |
| R | 14749 | 1.4% |
| Other values (6) | 44733 | 4.2% |
Common
| Value | Count | Frequency (%) |
| (space) | 119228 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1193900 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 223869 | 18.8% |
| e | 119552 | 10.0% |
| N | 119228 | 10.0% |
| (space) | 119228 | 10.0% |
| s | 104641 | 8.8% |
| i | 104641 | 8.8% |
| t | 104641 | 8.8% |
| p | 104641 | 8.8% |
| D | 104641 | 8.8% |
| n | 29336 | 2.5% |
| Other values (7) | 59482 | 5.0% |
id_travel_agency_booking
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 333 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 16340 |
| Missing (%) | 13.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 86.693382 |
| Minimum | 1 |
|---|---|
| Maximum | 535 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 9 |
| median | 14 |
| Q3 | 229 |
| 95-th percentile | 250 |
| Maximum | 535 |
| Range | 534 |
| Interquartile range (IQR) | 220 |
Descriptive statistics
| Standard deviation | 110.77455 |
|---|---|
| Coefficient of variation (CV) | 1.277774 |
| Kurtosis | -0.0071795649 |
| Mean | 86.693382 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 1.0893856 |
| Sum | 8933753 |
| Variance | 12271 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 9 | 31961 | 26.8% |
| 240 | 13922 | 11.7% |
| 1 | 7191 | 6.0% |
| 14 | 3640 | 3.0% |
| 7 | 3539 | 3.0% |
| 6 | 3290 | 2.8% |
| 250 | 2870 | 2.4% |
| 241 | 1721 | 1.4% |
| 28 | 1666 | 1.4% |
| 8 | 1514 | 1.3% |
| Other values (323) | 31736 | 26.6% |
| (Missing) | 16340 | 13.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 7191 | 6.0% |
| 2 | 162 | 0.1% |
| 3 | 1336 | 1.1% |
| 4 | 47 | < 0.1% |
| 5 | 330 | 0.3% |
| 6 | 3290 | 2.8% |
| 7 | 3539 | 3.0% |
| 8 | 1514 | 1.3% |
| 9 | 31961 | 26.8% |
| 10 | 260 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 535 | 3 | < 0.1% |
| 531 | 68 | 0.1% |
| 527 | 35 | < 0.1% |
| 526 | 10 | < 0.1% |
| 510 | 2 | < 0.1% |
| 509 | 10 | < 0.1% |
| 508 | 6 | < 0.1% |
| 502 | 24 | < 0.1% |
| 497 | 1 | < 0.1% |
| 495 | 57 | < 0.1% |
id_person_booking
Real number (ℝ)
MISSING 
| Distinct | 352 |
|---|---|
| Distinct (%) | 5.2% |
| Missing | 112593 |
| Missing (%) | 94.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 189.26674 |
| Minimum | 6 |
|---|---|
| Maximum | 543 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 40 |
| Q1 | 62 |
| median | 179 |
| Q3 | 270 |
| 95-th percentile | 435 |
| Maximum | 543 |
| Range | 537 |
| Interquartile range (IQR) | 208 |
Descriptive statistics
| Standard deviation | 131.65501 |
|---|---|
| Coefficient of variation (CV) | 0.69560567 |
| Kurtosis | -0.49079521 |
| Mean | 189.26674 |
| Median Absolute Deviation (MAD) | 111 |
| Skewness | 0.60159967 |
| Sum | 1286446 |
| Variance | 17333.043 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 40 | 927 | 0.8% |
| 223 | 784 | 0.7% |
| 67 | 267 | 0.2% |
| 45 | 250 | 0.2% |
| 153 | 215 | 0.2% |
| 174 | 149 | 0.1% |
| 219 | 141 | 0.1% |
| 281 | 138 | 0.1% |
| 154 | 133 | 0.1% |
| 405 | 119 | 0.1% |
| Other values (342) | 3674 | 3.1% |
| (Missing) | 112593 | 94.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 6 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 37 | < 0.1% |
| 10 | 1 | < 0.1% |
| 11 | 1 | < 0.1% |
| 12 | 14 | < 0.1% |
| 14 | 9 | < 0.1% |
| 16 | 5 | < 0.1% |
| 18 | 1 | < 0.1% |
| 20 | 50 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 543 | 2 | < 0.1% |
| 541 | 1 | < 0.1% |
| 539 | 2 | < 0.1% |
| 534 | 2 | < 0.1% |
| 531 | 1 | < 0.1% |
| 530 | 5 | < 0.1% |
| 528 | 2 | < 0.1% |
| 525 | 15 | < 0.1% |
| 523 | 19 | < 0.1% |
| 521 | 7 | < 0.1% |
customer_type
Categorical
IMBALANCE 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| 0 | 89613 |
|---|---|
| 2 | 25124 |
| 1 | 4076 |
| 3 | 577 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 119390 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 89613 | 75.1% |
| 2 | 25124 | 21.0% |
| 1 | 4076 | 3.4% |
| 3 | 577 | 0.5% |
[Plots: Length; Common Values]
Words
| Value | Count | Frequency (%) |
| 0 | 89613 | 75.1% |
| 2 | 25124 | 21.0% |
| 1 | 4076 | 3.4% |
| 3 | 577 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 89613 | 75.1% |
| 2 | 25124 | 21.0% |
| 1 | 4076 | 3.4% |
| 3 | 577 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 119390 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 89613 | 75.1% |
| 2 | 25124 | 21.0% |
| 1 | 4076 | 3.4% |
| 3 | 577 | 0.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 119390 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 89613 | 75.1% |
| 2 | 25124 | 21.0% |
| 1 | 4076 | 3.4% |
| 3 | 577 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 119390 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 89613 | 75.1% |
| 2 | 25124 | 21.0% |
| 1 | 4076 | 3.4% |
| 3 | 577 | 0.5% |
avg_price
Real number (ℝ)
ZEROS 
| Distinct | 8726 |
|---|---|
| Distinct (%) | 7.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 101.71874 |
| Minimum | 0 |
|---|---|
| Maximum | 300 |
| Zeros | 1960 |
| Zeros (%) | 1.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 38.4 |
| Q1 | 69.29 |
| median | 94.575 |
| Q3 | 126 |
| 95-th percentile | 193.5 |
| Maximum | 300 |
| Range | 300 |
| Interquartile range (IQR) | 56.71 |
Descriptive statistics
| Standard deviation | 47.823771 |
|---|---|
| Coefficient of variation (CV) | 0.47015691 |
| Kurtosis | 1.5972017 |
| Mean | 101.71874 |
| Median Absolute Deviation (MAD) | 27.825 |
| Skewness | 0.94134242 |
| Sum | 12144201 |
| Variance | 2287.1131 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 62 | 3754 | 3.1% |
| 75 | 2715 | 2.3% |
| 90 | 2473 | 2.1% |
| 65 | 2418 | 2.0% |
| 0 | 1960 | 1.6% |
| 80 | 1889 | 1.6% |
| 95 | 1661 | 1.4% |
| 120 | 1607 | 1.3% |
| 100 | 1573 | 1.3% |
| 85 | 1538 | 1.3% |
| Other values (8716) | 97802 | 81.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 1960 | 1.6% |
| 0.26 | 1 | < 0.1% |
| 0.5 | 1 | < 0.1% |
| 1 | 15 | < 0.1% |
| 1.29 | 1 | < 0.1% |
| 1.48 | 1 | < 0.1% |
| 1.56 | 2 | < 0.1% |
| 1.6 | 1 | < 0.1% |
| 1.8 | 1 | < 0.1% |
| 2 | 12 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 300 | 290 | 0.2% |
| 299.43 | 1 | < 0.1% |
| 299.33 | 2 | < 0.1% |
| 299.2 | 1 | < 0.1% |
| 299 | 10 | < 0.1% |
| 298.71 | 1 | < 0.1% |
| 298 | 5 | < 0.1% |
| 297.57 | 1 | < 0.1% |
| 297.5 | 1 | < 0.1% |
| 297.38 | 1 | < 0.1% |
required_car_parking_spaces
Categorical
IMBALANCE 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| 0 | 111974 |
|---|---|
| 1 | 7383 |
| 2 | 28 |
| 3 | 3 |
| 8 | 2 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 119390 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 111974 | 93.8% |
| 1 | 7383 | 6.2% |
| 2 | 28 | < 0.1% |
| 3 | 3 | < 0.1% |
| 8 | 2 | < 0.1% |
[Plots: Length; Common Values]
Words
| Value | Count | Frequency (%) |
| 0 | 111974 | 93.8% |
| 1 | 7383 | 6.2% |
| 2 | 28 | < 0.1% |
| 3 | 3 | < 0.1% |
| 8 | 2 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 111974 | 93.8% |
| 1 | 7383 | 6.2% |
| 2 | 28 | < 0.1% |
| 3 | 3 | < 0.1% |
| 8 | 2 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 119390 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 111974 | 93.8% |
| 1 | 7383 | 6.2% |
| 2 | 28 | < 0.1% |
| 3 | 3 | < 0.1% |
| 8 | 2 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 119390 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 111974 | 93.8% |
| 1 | 7383 | 6.2% |
| 2 | 28 | < 0.1% |
| 3 | 3 | < 0.1% |
| 8 | 2 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 119390 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 111974 | 93.8% |
| 1 | 7383 | 6.2% |
| 2 | 28 | < 0.1% |
| 3 | 3 | < 0.1% |
| 8 | 2 | < 0.1% |
total_of_special_requests
Real number (ℝ)
ZEROS 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.57136276 |
| Minimum | 0 |
|---|---|
| Maximum | 5 |
| Zeros | 70318 |
| Zeros (%) | 58.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.79279842 |
|---|---|
| Coefficient of variation (CV) | 1.387557 |
| Kurtosis | 1.4925648 |
| Mean | 0.57136276 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.3491894 |
| Sum | 68215 |
| Variance | 0.62852934 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 0 | 70318 | 58.9% |
| 1 | 33226 | 27.8% |
| 2 | 12969 | 10.9% |
| 3 | 2497 | 2.1% |
| 4 | 340 | 0.3% |
| 5 | 40 | < 0.1% |
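Because total_of_special_requests takes only six distinct values, the table above fully determines its summary statistics. As a cross-check, the sketch below rebuilds Sum, Mean and Variance from the value counts alone. It assumes the reported variance is the sample variance (divisor n - 1); the variable names are illustrative.

```python
# Value -> count, copied from the frequency table for total_of_special_requests.
counts = {0: 70318, 1: 33226, 2: 12969, 3: 2497, 4: 340, 5: 40}

n = sum(counts.values())                                # number of observations
total = sum(value * c for value, c in counts.items())   # "Sum"
mean = total / n                                        # "Mean"

# Sample variance (divisor n - 1), assumed to match the report's convention.
sum_sq_dev = sum(c * (value - mean) ** 2 for value, c in counts.items())
variance = sum_sq_dev / (n - 1)

print(n, total, round(mean, 8), round(variance, 8))
```

The counts add up to the 119390 observations and the sum to 68215, and the sample-variance convention reproduces the reported 0.62852934 to the published precision.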
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 70318 | 58.9% |
| 1 | 33226 | 27.8% |
| 2 | 12969 | 10.9% |
| 3 | 2497 | 2.1% |
| 4 | 340 | 0.3% |
| 5 | 40 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 5 | 40 | < 0.1% |
| 4 | 340 | 0.3% |
| 3 | 2497 | 2.1% |
| 2 | 12969 | 10.9% |
| 1 | 33226 | 27.8% |
| 0 | 70318 | 58.9% |
Correlations
| | avg_price | breakfast | cancellation | changes_between_booking_arrival | customer_type | day_of_month_arrival_date | days_between_booking_arrival | deposit_policy | distribution_channel | id_person_booking | id_travel_agency_booking | market_segment | month_arrival_date | num_adults | num_babies | num_children | num_previous_cancellations | num_previous_stays | num_weekend_nights | num_workweek_nights | repeated_guest | required_car_parking_spaces | reserved_room | total_of_special_requests | type | week_number_arrival_date | year_arrival_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| avg_price | 1.000 | 0.135 | 0.118 | 0.005 | 0.126 | 0.027 | 0.015 | 0.131 | 0.119 | 0.052 | -0.049 | -0.259 | 0.181 | 0.280 | 0.022 | 0.204 | -0.150 | -0.143 | 0.051 | 0.094 | 0.190 | 0.055 | 0.191 | 0.196 | 0.471 | 0.074 | 0.153 |
| breakfast | 0.135 | 1.000 | 0.013 | -0.049 | 0.078 | 0.009 | -0.063 | 0.072 | 0.121 | -0.036 | 0.036 | -0.065 | 0.081 | -0.051 | 0.009 | 0.037 | 0.033 | 0.066 | -0.061 | -0.043 | 0.060 | 0.032 | 0.121 | -0.021 | 0.041 | 0.000 | 0.039 |
| cancellation | 0.118 | 0.013 | 1.000 | -0.185 | 0.136 | -0.006 | 0.317 | 0.481 | 0.177 | -0.011 | -0.115 | 0.215 | 0.070 | 0.067 | 0.034 | 0.028 | 0.270 | -0.115 | -0.004 | 0.041 | 0.085 | 0.197 | 0.073 | -0.259 | 0.136 | 0.008 | 0.026 |
| changes_between_booking_arrival | 0.005 | -0.049 | -0.185 | 1.000 | 0.028 | 0.012 | -0.008 | 0.029 | 0.027 | 0.176 | 0.091 | -0.071 | 0.010 | -0.085 | 0.017 | 0.018 | -0.073 | 0.031 | 0.040 | 0.064 | 0.000 | 0.016 | 0.014 | 0.042 | 0.040 | 0.008 | 0.016 |
| customer_type | 0.126 | 0.078 | 0.136 | 0.028 | 1.000 | 0.002 | 0.159 | 0.098 | 0.079 | 0.310 | -0.042 | 0.373 | 0.103 | -0.124 | 0.015 | 0.061 | 0.096 | -0.036 | -0.035 | -0.028 | 0.105 | 0.041 | 0.109 | -0.146 | 0.052 | 0.076 | 0.213 |
| day_of_month_arrival_date | 0.027 | 0.009 | -0.006 | 0.012 | 0.002 | 1.000 | 0.008 | 0.054 | 0.028 | 0.046 | 0.005 | 0.001 | 0.058 | 0.002 | 0.005 | 0.010 | -0.012 | -0.001 | -0.007 | -0.016 | 0.017 | 0.008 | 0.010 | 0.003 | 0.026 | 0.061 | 0.044 |
| days_between_booking_arrival | 0.015 | -0.063 | 0.317 | -0.008 | 0.159 | 0.008 | 1.000 | 0.273 | 0.116 | 0.286 | -0.123 | 0.409 | 0.132 | 0.192 | 0.007 | 0.028 | 0.171 | -0.189 | 0.162 | 0.296 | 0.134 | 0.057 | 0.048 | -0.074 | 0.094 | 0.113 | 0.104 |
| deposit_policy | 0.131 | 0.072 | 0.481 | 0.029 | 0.098 | 0.054 | 0.273 | 1.000 | 0.091 | 0.022 | -0.137 | 0.455 | 0.101 | -0.029 | 0.023 | 0.073 | 0.318 | -0.064 | -0.116 | -0.055 | 0.058 | 0.071 | 0.152 | -0.302 | 0.177 | 0.006 | 0.052 |
| distribution_channel | 0.119 | 0.121 | 0.177 | 0.027 | 0.079 | 0.028 | 0.116 | 0.091 | 1.000 | 0.136 | -0.218 | 0.481 | 0.069 | 0.157 | 0.029 | 0.043 | 0.020 | -0.245 | 0.087 | 0.102 | 0.297 | 0.076 | 0.100 | 0.091 | 0.187 | 0.009 | 0.027 |
| id_person_booking | 0.052 | -0.036 | -0.011 | 0.176 | 0.310 | 0.046 | 0.286 | 0.022 | 0.136 | 1.000 | 0.226 | 0.196 | 0.217 | 0.230 | 0.032 | 0.039 | -0.198 | -0.298 | 0.076 | 0.250 | 0.358 | 0.048 | 0.098 | -0.128 | 0.498 | -0.058 | 0.281 |
| id_travel_agency_booking | -0.049 | 0.036 | -0.115 | 0.091 | -0.042 | 0.005 | -0.123 | -0.137 | -0.218 | 0.226 | 1.000 | -0.116 | 0.083 | -0.056 | 0.026 | 0.058 | -0.168 | 0.060 | 0.131 | 0.170 | 0.076 | 0.131 | 0.143 | 0.015 | 0.817 | -0.057 | 0.091 |
| market_segment | -0.259 | -0.065 | 0.215 | -0.071 | 0.373 | 0.001 | 0.409 | 0.455 | 0.481 | 0.196 | -0.116 | 1.000 | 0.088 | -0.017 | 0.034 | 0.100 | 0.194 | -0.153 | 0.011 | 0.035 | 0.347 | 0.092 | 0.138 | -0.293 | 0.147 | 0.048 | 0.159 |
| month_arrival_date | 0.181 | 0.081 | 0.070 | 0.010 | 0.103 | 0.058 | 0.132 | 0.101 | 0.069 | 0.217 | 0.083 | 0.088 | 1.000 | -0.078 | 0.016 | 0.069 | 0.045 | -0.007 | -0.035 | -0.024 | 0.075 | 0.018 | 0.045 | -0.058 | 0.070 | 0.337 | 0.429 |
| num_adults | 0.280 | -0.051 | 0.067 | -0.085 | -0.124 | 0.002 | 0.192 | -0.029 | 0.157 | 0.230 | -0.056 | -0.017 | -0.078 | 1.000 | 0.000 | 0.000 | -0.036 | -0.210 | 0.127 | 0.153 | 0.000 | 0.000 | 0.003 | 0.162 | 0.014 | 0.026 | 0.015 |
| num_babies | 0.022 | 0.009 | 0.034 | 0.017 | 0.015 | 0.005 | 0.007 | 0.023 | 0.029 | 0.032 | 0.026 | 0.034 | 0.016 | 0.000 | 1.000 | 0.025 | -0.017 | -0.011 | 0.023 | 0.026 | 0.007 | 0.020 | 0.040 | 0.093 | 0.049 | 0.013 | 0.009 |
| num_children | 0.204 | 0.037 | 0.028 | 0.018 | 0.061 | 0.010 | 0.028 | 0.073 | 0.043 | 0.039 | 0.058 | 0.100 | 0.069 | 0.000 | 0.025 | 1.000 | -0.059 | -0.035 | 0.053 | 0.054 | 0.035 | 0.030 | 0.357 | 0.096 | 0.046 | 0.006 | 0.044 |
| num_previous_cancellations | -0.150 | 0.033 | 0.270 | -0.073 | 0.096 | -0.012 | 0.171 | 0.318 | 0.020 | -0.198 | -0.168 | 0.194 | 0.045 | -0.036 | -0.017 | -0.059 | 1.000 | 0.102 | -0.055 | -0.062 | 0.185 | 0.000 | 0.006 | -0.129 | 0.050 | 0.087 | 0.052 |
| num_previous_stays | -0.143 | 0.066 | -0.115 | 0.031 | -0.036 | -0.001 | -0.189 | -0.064 | -0.245 | -0.298 | 0.060 | -0.153 | -0.007 | -0.210 | -0.011 | -0.035 | 0.102 | 1.000 | -0.084 | -0.119 | 0.320 | 0.019 | 0.003 | 0.025 | 0.017 | -0.043 | 0.025 |
| num_weekend_nights | 0.051 | -0.061 | -0.004 | 0.040 | -0.035 | -0.007 | 0.162 | -0.116 | 0.087 | 0.076 | 0.131 | 0.011 | -0.035 | 0.127 | 0.023 | 0.053 | -0.055 | -0.084 | 1.000 | 0.238 | 0.082 | 0.015 | 0.054 | 0.079 | 0.198 | 0.026 | 0.029 |
| num_workweek_nights | 0.094 | -0.043 | 0.041 | 0.064 | -0.028 | -0.016 | 0.296 | -0.055 | 0.102 | 0.250 | 0.170 | 0.035 | -0.024 | 0.153 | 0.026 | 0.054 | -0.062 | -0.119 | 0.238 | 1.000 | 0.017 | 0.017 | 0.044 | 0.076 | 0.192 | 0.026 | 0.014 |
| repeated_guest | 0.190 | 0.060 | 0.085 | 0.000 | 0.105 | 0.017 | 0.134 | 0.058 | 0.297 | 0.358 | 0.076 | 0.347 | 0.075 | 0.000 | 0.007 | 0.035 | 0.185 | 0.320 | 0.082 | 0.017 | 1.000 | 0.078 | 0.037 | 0.006 | 0.050 | -0.030 | 0.010 |
| required_car_parking_spaces | 0.055 | 0.032 | 0.197 | 0.016 | 0.041 | 0.008 | 0.057 | 0.071 | 0.076 | 0.048 | 0.131 | 0.092 | 0.018 | 0.000 | 0.020 | 0.030 | 0.000 | 0.019 | 0.015 | 0.017 | 0.078 | 1.000 | 0.079 | 0.088 | 0.221 | 0.003 | 0.018 |
| reserved_room | 0.191 | 0.121 | 0.073 | 0.014 | 0.109 | 0.010 | 0.048 | 0.152 | 0.100 | 0.098 | 0.143 | 0.138 | 0.045 | 0.003 | 0.040 | 0.357 | 0.006 | 0.003 | 0.054 | 0.044 | 0.037 | 0.079 | 1.000 | 0.152 | 0.323 | -0.011 | 0.082 |
| total_of_special_requests | 0.196 | -0.021 | -0.259 | 0.042 | -0.146 | 0.003 | -0.074 | -0.302 | 0.091 | -0.128 | 0.015 | -0.293 | -0.058 | 0.162 | 0.093 | 0.096 | -0.129 | 0.025 | 0.079 | 0.076 | 0.006 | 0.088 | 0.152 | 1.000 | 0.046 | 0.019 | 0.091 |
| type | 0.471 | 0.041 | 0.136 | 0.040 | 0.052 | 0.026 | 0.094 | 0.177 | 0.187 | 0.498 | 0.817 | 0.147 | 0.070 | 0.014 | 0.049 | 0.046 | 0.050 | 0.017 | 0.198 | 0.192 | 0.050 | 0.221 | 0.323 | 0.046 | 1.000 | 0.001 | 0.043 |
| week_number_arrival_date | 0.074 | 0.000 | 0.008 | 0.008 | 0.076 | 0.061 | 0.113 | 0.006 | 0.009 | -0.058 | -0.057 | 0.048 | 0.337 | 0.026 | 0.013 | 0.006 | 0.087 | -0.043 | 0.026 | 0.026 | -0.030 | 0.003 | -0.011 | 0.019 | 0.001 | 1.000 | 0.424 |
| year_arrival_date | 0.153 | 0.039 | 0.026 | 0.016 | 0.213 | 0.044 | 0.104 | 0.052 | 0.027 | 0.281 | 0.091 | 0.159 | 0.429 | 0.015 | 0.009 | 0.044 | 0.052 | 0.025 | 0.029 | 0.014 | 0.010 | 0.018 | 0.082 | 0.091 | 0.043 | 0.424 | 1.000 |
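The overview alerts single out id_travel_agency_booking and type as highly correlated. A minimal sketch of how such pairs could be pulled from the matrix above follows. It copies a handful of coefficients from the `type` row and applies a cutoff of 0.5, which is an assumed threshold chosen for illustration and not one stated in this report; `high_correlation_pairs` is a hypothetical helper.

```python
# A few coefficients copied from the "type" row of the matrix.
type_row = {
    "avg_price": 0.471,
    "id_person_booking": 0.498,
    "id_travel_agency_booking": 0.817,
    "reserved_room": 0.323,
    "required_car_parking_spaces": 0.221,
}

def high_correlation_pairs(row_name, row, threshold=0.5):
    """Return (row_name, column, coefficient) where |coefficient| >= threshold."""
    return [
        (row_name, column, coef)
        for column, coef in row.items()
        if column != row_name and abs(coef) >= threshold
    ]

flagged = high_correlation_pairs("type", type_row)
print(flagged)
```

With the assumed 0.5 cutoff only the pair (type, id_travel_agency_booking) at 0.817 is returned, which is consistent with the two High correlation alerts; id_person_booking at 0.498 falls just short.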
First rows
| | cancellation | type | days_between_booking_arrival | year_arrival_date | month_arrival_date | week_number_arrival_date | day_of_month_arrival_date | num_weekend_nights | num_workweek_nights | num_adults | num_children | num_babies | breakfast | country | market_segment | distribution_channel | repeated_guest | num_previous_cancellations | num_previous_stays | reserved_room | changes_between_booking_arrival | deposit_policy | id_travel_agency_booking | id_person_booking | customer_type | avg_price | required_car_parking_spaces | total_of_special_requests |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Fancy Hotel | 342 | 2015 | July | 27 | 1 | 0 | 0 | 2 | 0.0 | 0 | True | PRT | 0 | 0 | 0 | 0 | 0 | C | 3 | No Deposit | NaN | NaN | 0 | 0.0 | 0 | 0 |
| 1 | 0 | Fancy Hotel | 737 | 2015 | July | 27 | 1 | 0 | 0 | 2 | 0.0 | 0 | True | PRT | 0 | 0 | 0 | 0 | 0 | C | 4 | No Deposit | NaN | NaN | 0 | 0.0 | 0 | 0 |
| 2 | 0 | Fancy Hotel | 7 | 2015 | July | 27 | 1 | 0 | 1 | 1 | 0.0 | 0 | True | GBR | 0 | 0 | 0 | 0 | 0 | A | 0 | No Deposit | NaN | NaN | 0 | 75.0 | 0 | 0 |
| 3 | 0 | Fancy Hotel | 13 | 2015 | July | 27 | 1 | 0 | 1 | 1 | 0.0 | 0 | True | GBR | 1 | 1 | 0 | 0 | 0 | A | 0 | No Deposit | 304.0 | NaN | 0 | 75.0 | 0 | 0 |
| 4 | 0 | Fancy Hotel | 14 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | True | GBR | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 240.0 | NaN | 0 | 98.0 | 0 | 1 |
| 5 | 0 | Fancy Hotel | 14 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | True | GBR | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 240.0 | NaN | 0 | 98.0 | 0 | 1 |
| 6 | 0 | Fancy Hotel | 0 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | True | PRT | 0 | 0 | 0 | 0 | 0 | C | 0 | No Deposit | NaN | NaN | 0 | 107.0 | 0 | 0 |
| 7 | 0 | Fancy Hotel | 9 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | False | PRT | 0 | 0 | 0 | 0 | 0 | C | 0 | No Deposit | 303.0 | NaN | 0 | 103.0 | 0 | 1 |
| 8 | 1 | Fancy Hotel | 85 | 2015 | July | 27 | 1 | 0 | 3 | 2 | 0.0 | 0 | True | PRT | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 240.0 | NaN | 0 | 82.0 | 0 | 1 |
| 9 | 1 | Fancy Hotel | 75 | 2015 | July | 27 | 1 | 0 | 3 | 2 | 0.0 | 0 | False | PRT | 3 | 2 | 0 | 0 | 0 | D | 0 | No Deposit | 15.0 | NaN | 0 | 105.5 | 0 | 0 |
Last rows
| | cancellation | type | days_between_booking_arrival | year_arrival_date | month_arrival_date | week_number_arrival_date | day_of_month_arrival_date | num_weekend_nights | num_workweek_nights | num_adults | num_children | num_babies | breakfast | country | market_segment | distribution_channel | repeated_guest | num_previous_cancellations | num_previous_stays | reserved_room | changes_between_booking_arrival | deposit_policy | id_travel_agency_booking | id_person_booking | customer_type | avg_price | required_car_parking_spaces | total_of_special_requests |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 119380 | 0 | Hotel | 44 | 2017 | August | 35 | 31 | 1 | 3 | 2 | 0.0 | 0 | False | DEU | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 9.0 | NaN | 0 | 140.75 | 0 | 1 |
| 119381 | 0 | Hotel | 188 | 2017 | August | 35 | 31 | 2 | 3 | 2 | 0.0 | 0 | True | DEU | 0 | 0 | 0 | 0 | 0 | A | 0 | No Deposit | 14.0 | NaN | 0 | 99.00 | 0 | 0 |
| 119382 | 0 | Hotel | 135 | 2017 | August | 35 | 30 | 2 | 4 | 3 | 0.0 | 0 | True | JPN | 2 | 2 | 0 | 0 | 0 | G | 0 | No Deposit | 7.0 | NaN | 0 | 209.00 | 0 | 0 |
| 119383 | 0 | Hotel | 164 | 2017 | August | 35 | 31 | 2 | 4 | 2 | 0.0 | 0 | True | DEU | 3 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 42.0 | NaN | 0 | 87.60 | 0 | 0 |
| 119384 | 0 | Hotel | 21 | 2017 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | 0 | True | BEL | 3 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 394.0 | NaN | 0 | 96.14 | 0 | 2 |
| 119385 | 0 | Hotel | 23 | 2017 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | 0 | True | BEL | 3 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 394.0 | NaN | 0 | 96.14 | 0 | 0 |
| 119386 | 0 | Hotel | 102 | 2017 | August | 35 | 31 | 2 | 5 | 3 | 0.0 | 0 | True | FRA | 2 | 2 | 0 | 0 | 0 | E | 0 | No Deposit | 9.0 | NaN | 0 | 225.43 | 0 | 2 |
| 119387 | 0 | Hotel | 34 | 2017 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | 0 | True | DEU | 2 | 2 | 0 | 0 | 0 | D | 0 | No Deposit | 9.0 | NaN | 0 | 157.71 | 0 | 4 |
| 119388 | 0 | Hotel | 109 | 2017 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | 0 | True | GBR | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 89.0 | NaN | 0 | 104.40 | 0 | 0 |
| 119389 | 0 | Hotel | 205 | 2017 | August | 35 | 29 | 2 | 7 | 2 | 0.0 | 0 | False | DEU | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 9.0 | NaN | 0 | 151.20 | 0 | 2 |
Most frequently occurring
| | cancellation | type | days_between_booking_arrival | year_arrival_date | month_arrival_date | week_number_arrival_date | day_of_month_arrival_date | num_weekend_nights | num_workweek_nights | num_adults | num_children | num_babies | breakfast | country | market_segment | distribution_channel | repeated_guest | num_previous_cancellations | num_previous_stays | reserved_room | changes_between_booking_arrival | deposit_policy | id_travel_agency_booking | id_person_booking | customer_type | avg_price | required_car_parking_spaces | total_of_special_requests | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7805 | 1 | Hotel | 277 | 2016 | November | 46 | 7 | 1 | 2 | 2 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | NaN | NaN | 0 | 100.0 | 0 | 0 | 180 |
| 6587 | 1 | Hotel | 68 | 2016 | February | 8 | 17 | 0 | 2 | 2 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 1 | 0 | A | 0 | Non Refund | 37.0 | NaN | 0 | 75.0 | 0 | 0 | 150 |
| 6244 | 1 | Hotel | 34 | 2015 | December | 50 | 8 | 0 | 2 | 1 | 0.0 | 0 | True | PRT | 3 | 2 | 0 | 1 | 0 | A | 0 | Non Refund | 19.0 | NaN | 0 | 90.0 | 0 | 0 | 140 |
| 7476 | 1 | Hotel | 188 | 2016 | June | 25 | 15 | 0 | 2 | 1 | 0.0 | 0 | True | PRT | 3 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | 119.0 | NaN | 0 | 130.0 | 0 | 0 | 109 |
| 7285 | 1 | Hotel | 158 | 2016 | May | 22 | 24 | 0 | 2 | 1 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | 37.0 | NaN | 0 | 130.0 | 0 | 0 | 101 |
| 6187 | 1 | Hotel | 28 | 2017 | March | 9 | 2 | 0 | 3 | 2 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | NaN | NaN | 0 | 95.0 | 0 | 0 | 99 |
| 6307 | 1 | Hotel | 38 | 2017 | January | 2 | 14 | 0 | 1 | 1 | 0.0 | 0 | True | PRT | 1 | 1 | 0 | 0 | 0 | A | 0 | Non Refund | NaN | 67.0 | 0 | 75.0 | 0 | 0 | 99 |
| 7278 | 1 | Hotel | 156 | 2017 | April | 17 | 26 | 0 | 3 | 2 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | 37.0 | NaN | 0 | 100.0 | 0 | 0 | 99 |
| 6612 | 1 | Hotel | 71 | 2016 | June | 25 | 14 | 0 | 3 | 1 | 0.0 | 0 | True | PRT | 3 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | 236.0 | NaN | 0 | 120.0 | 0 | 0 | 89 |
| 4212 | 0 | Hotel | 164 | 2015 | October | 40 | 2 | 0 | 2 | 1 | 0.0 | 0 | True | PRT | 3 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 19.0 | NaN | 2 | 100.0 | 0 | 0 | 87 |
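The "# duplicates" column can be read as the number of times an identical row occurs. The following sketch shows one way to build such a table with the standard library, assuming rows are compared on all of their columns. The three-column records are made-up toy data, not rows from this dataset.

```python
from collections import Counter

# Toy records (hypothetical): (type, deposit_policy, avg_price).
rows = [
    ("Hotel", "Non Refund", 100.0),
    ("Hotel", "Non Refund", 100.0),
    ("Hotel", "No Deposit", 75.0),
    ("Hotel", "Non Refund", 100.0),
    ("Fancy Hotel", "No Deposit", 98.0),
    ("Fancy Hotel", "No Deposit", 98.0),
]

# Count identical rows, keep only those seen more than once,
# and list the most frequent first.
row_counts = Counter(rows)
duplicates = [(row, n) for row, n in row_counts.most_common() if n > 1]

for row, n in duplicates:
    print(row, n)
```

On real rows, missing values would first need a consistent placeholder such as `None`, since NaN does not compare equal to itself and identical-looking rows could otherwise be counted separately.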